A step toward the foundations of data mining

نویسنده

  • Yiyu Yao
چکیده

This paper addresses some fundamental issues related to the foundations of data mining. It is argued that there is an urgent need for formal and mathematical modeling of data mining. A formal framework provides a solid basis for a systematic study of many fundamental issues, such as representations and interpretations of primitive notions of data mining, data mining algorithms, explanations and applications of data mining results. A multi-level framework is proposed for modeling data mining based on results from many related fields. Formal concepts are adopted as the primitive notion. A concept is jointly defined as a pair consisting of the intension and the extension of the concept, namely, a formula in a certain language and a subset of the universe. An object satisfies the formula of a concept if the object has the properties as specified by the formula, and the object belongs to the extension of the concept. Rules are used to describe relationships between concepts. A rule is expressed in terms of the intensions of the two concepts and is interpreted in terms of the extensions of the concepts. Several different types of rules are investigated. The usefulness and meaningfulness of discovered knowledge are examined using a utility model and an explanation model.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A 3D Discrete Element Analysis of Failure Mechanism of Shallow Foundations in Rocks

In this research work, a 3D numerical modeling technique is proposed based on the 3D particle flow code in order to investigate the failure mechanism of rock foundations. Two series of footings with different geometries and areas are considered in this work. The failure mechanism obtained is similar to that of the Terzaghi’s but there is a negligible difference in between. Lastly, one equation ...

متن کامل

Competitive Intelligence Text Mining: Words Speak

Competitive intelligence (CI) has become one of the major subjects for researchers in recent years. The present research is aimed to achieve a part of the CI by investigating the scientific articles on this field through text mining in three interrelated steps. In the first step, a total of 1143 articles released between 1987 and 2016 were selected by searching the phrase "competitive intellige...

متن کامل

کاربرد ابزارهای تحلیلگر داده کاوی و متن کاوی در چابکی سازمانهای مراقبت بهداشتی و درمانی

Introduction: The word agility identified the speed and the power of responses during facing with organization internal and external matters. The health care organizations must be agile like any other organization in today fast speeding world, because being agile is an additional advantage in the competitive world. In this paper the organizations' agility, data mining, text mining, and the role...

متن کامل

Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines

In this paper, principles and existing feature selection methods for classifying and clustering data be introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search based procedures as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...

متن کامل

خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی

Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003